\documentclass{article}
\usepackage{times}
\usepackage[paperwidth=6.5in, paperheight=9in]{geometry}
\usepackage{amsmath}
\title{The Young Modeler's Handbook}
\author{David V. Pynadath}
\begin{document}
\maketitle
\begin{quotation}
\noindent
\textit{I never satisfy myself until I can make a mechanical model of a thing. If I can make a mechanical model I can understand it.}

\hfill \textsc{Lord Kelvin}
\end{quotation}
\clearpage
\section{Worlds}
\begin{quotation}
\noindent\textit{O brave new world,\\
That has such people in't.}

\hfill\textsc{The Tempest, Act 5, Scene 1}
\end{quotation}

This is a world.

\begin{verbatim}
from psychsim.world import World

world = World()
\end{verbatim}

A world has people.

\begin{verbatim}
from psychsim.agent import Agent

rufus = Agent(Rufus)
world.addAgent(rufus)
\end{verbatim}

And groups of people.

\begin{verbatim}
free = Agent('Freedonia')
world.addAgent(free)
\end{verbatim}

And killer robots.

\begin{verbatim}
world.addAgent(Agent('Klaatu'))
\end{verbatim}

These are \textit{agents}, because they have names.

\begin{verbatim}
for name in world.agents.keys():
   print world.agents[name]
\end{verbatim}

But names are just string labels, so they're not a good reason to make agents special.

\subsection{State}
The {\em state} of a simulation captures the dynamic process by which it evolves. As in a {\em factored MDP}, we divide an overall state vector, $\vec S$, into a set of orthogonal features, $S_0\times S_1\times\cdots\times S_n$. Each feature captures a distinct aspect of the simulation state. 

\begin{verbatim}
from psychsim.pwl import *
s = KeyedVector({'S_0': 0.3, 'S_1': 0.7})
s['S_n'] = 0.4
for key in s.keys():
   print key,s[key]
\end{verbatim}

Notice that PsychSim allows you to refer to each feature by a meaningful {\em key}, as in Python's dictionary keys. Keys are treated internally as unstructured strings, but you may find it useful to make use of the the following types of structured keys.

\subsubsection{Unary Keys}
It can be useful to describe a state feature as being local to an individual agent. Doing so does {\em not} limit the dependency or influence of the state feature in any way. However, it can be useful to define state features as local to an agent to make the system output more readable. For example, the following defines a state feature ``troops'' for the previously defined agent ``Freedonia''.

\begin{verbatim}
world.defineState(free.name,'troops')
\end{verbatim}

For state features which are not local to any agent, a null agent name (i.e., {\tt None}) indicates that the state feature will pertain to the world as a whole. Again, from the system's perspective, this makes no difference, but it can be useful to distinguish global and local state in presentation.

\begin{verbatim}
world.defineState(None,'treaty)
\end{verbatim}

Thus, one reason for creating an agent is to group a set of such state features under a common name, as opposed to leaving them as part of the global world state.

\subsubsection{Binary Keys}
There can also be state that pertains to the {\em relationship} between two agents. 

\begin{verbatim}
world.defineRelation(free.name,sylv.name,'trusts')
\end{verbatim}

The order in which the agents appear in this definition {\em does} matter, as reversing the order will generate a reference to a different element of the state vector.

\section{Actions}
The most common reason for creating an agent is because there is an entity that can take {\em actions} that change the state of the world. If an entity has a deterministic effect on the world, you can define a single action for it. However, agents typically have multiple actions, and it is the decision among them that is the focus of the simulation.

\subsection{Atomic Actions}

The {\tt verb} of an individual action is a required field when defining the action. 

\begin{verbatim}
freeReject = free.addAction({'verb': 'reject'})
\end{verbatim}

The action created will also have a {\tt subject} field, representing the agent who is performing this action. The {\tt subject} field is automatically filled in with the name of the agent. A third optional field, {\tt object}, can represent the target of the specific action.

\begin{verbatim}
freeBattle = free.addAction({'verb': 'attack','object': sylv.name})
\end{verbatim}

You are free to define any other fields as well to contain other parameterizations of the actions.

\begin{verbatim}
free.addAction({'verb': 'offer','object': sylv.name,'amount': 50})
\end{verbatim}

We will describe the use of these fields in Section \ref{sec:dynamics}.

\subsection{Action Sets}

Sometimes an agent can make a decision that simultaneously combines multiple actions into a single choice.

\begin{verbatim}
freeRejectAndAttack = free.addAction(set([{'verb': 'attack','object': sylv.name},
                                          {'verb': 'reject'}]))
\end{verbatim}

For the purposes of the agent's decision problem, this option is equivalent to a single atomic action (e.g., {\tt reject-and-attack}). However, as we will see in Section \ref{sec:dynamics}, separate atomic actions can sometimes simplify the definition of the effects of such a combined action.

When inspecting an agent's action choice, each option is an {\tt ActionSet} instance, supporting all standard Python {\tt set} operations.

\begin{verbatim}
for action in free.actions:
   print len(action)
freeRejectAndAttack = freeReject | freeBattle
\end{verbatim}

As we will see in Section \ref{sec:legality}, not all of the agent's choices may be legal under all circumstances. Rather than inspecting the {\tt actions} attribute itself, we typically examine the context-specific set of action choices instead.

\begin{verbatim}
for vector in world.state.domain():
   print free.getActions(vector)
\end{verbatim}

\section{Piecewise Linear (PWL) Functions}

\subsection{Variables}
By default, a state feature is assumed to be real-valued, in $[-1,1]$. However, both the ``defineState'' and ``defineRelation'' methods take optional arguments to modify the valid ranges of the feature. The range of possible values does not affect the execution of the simulation, as simulation normalizes the values back to the $[-1,1]$ range during decision-making.

\begin{verbatim}
world.defineState(free.name,'troops',lo=0,hi=50000)
\end{verbatim}

It is also possible to specify that a state feature take on only integral values with the optional ``domain'' argument.

\begin{verbatim}
world.defineState(free.name,'troops',int,lo=0,hi=50000)
\end{verbatim}

One can also define a boolean state feature using the same argument, in which case the optional ``lo'' and ``hi'' arguments are obviously moot.

\begin{verbatim}
world.defineState(None,'treaty,bool)
\end{verbatim}

It is also possible to define an enumerated list of possible state features.

\begin{verbatim}
phases = ['offer','respond','rejection','end','paused',
          'engagement']
world.defineState(None,'phase',list,phases)
\end{verbatim}

The domains also work within ``defineRelation'' calls as well. In other words, relationships can take on the same sets of values as unary state features.

\subsection{Legality}\label{sec:legality}

\begin{verbatim}
tree = makeTree({'if': equalRow(stateKey(None,'phase'),'offer'),
                 True: True,    
                 False: False})
free.setLegal(action,tree)
\end{verbatim}

\subsection{Dynamics}\label{sec:dynamics}

\subsection{Termination}

{\em Termination} conditions specify when scenario execution should reach an absorbing end state (e.g., when a final goal is reached, when time has expired). A termination condition is a PWL function (Section \ref{sec:pwl}) with boolean leaves.

\begin{verbatim}
world.addTermination(makeTree({'if': trueRow(stateKey(None,'treaty')),
                               True: True, False: False}))
\end{verbatim}

This condition specifies that the simulation ends if a ``treaty'' is reached. Multiple conditions can be specified, with termination occurring if any condition is true.


\section{Reward}

\begin{verbatim}
    goalFTroops = maximizeFeature(stateKey(free.name,'troops'))
    free.setReward(goalFTroops,1.)
    goalFTerritory = maximizeFeature(stateKey(free.name,'territory'))
    free.setReward(goalFTerritory,1.)
\end{verbatim}

\section{Models}

A {\em model} in the PsychSim context is a potential configuration of an agent that may apply in certain worlds or decision-making contexts. All agents have a ``True'' model that represents their real configuration, which forms the basis of all of their decisions during execution. 


It also possible to specify alternate models that represent perturbations of this true model, either to represent the dynamics of the agent's configuration or to represent the perceptions other agents have of it.

\begin{verbatim}
free.addModel('friend')
\end{verbatim}

\subsection{Model Attribute: {\tt static}}

\section{Observations}

\section{PWL functions}\label{sec:pwl}

\section{Executing a Scenario}

\begin{verbatim}
world.step()
\end{verbatim}

\section{Algorithms}

\subsection{Decision Making}

\subsection{Belief Update}

World has $\Pr(M_0)$

World generates an observation $\omega$

But $M_0$ cannot generate $\omega$

\begin{align}
\Pr(M_1=m)&=\Pr(M_1=m|M_0,\omega)\\
&=\frac{\Pr(M_1=m,\omega|M_0)}{\Pr(\omega|M_0)}\\
&\propto 
\end{align}

And done.

\end{document}
